Relative Fisher Information and Natural Gradient for Learning Large Modular Models

Authors

  • Ke Sun
  • Frank Nielsen
Abstract

Fisher information and the natural gradient have provided deep insights and powerful tools for artificial neural networks. However, the related analysis becomes more and more difficult as the learner's structure grows large and complex. This paper makes a preliminary step towards a new direction. We extract a local component from a large neural system and define its relative Fisher information metric, which describes this small component accurately and is invariant to the other parts of the system. This concept is important because the geometric structure is much simplified, and it can easily be applied to guide the learning of neural networks. We provide an analysis of a list of commonly used components and demonstrate how to use this concept to further improve optimization.

1. Fisher Information Metric

The Fisher Information Metric (FIM) $I(\Theta) = (I_{ij})$ of a statistical parametric model $p(x \mid \Theta)$ of order $D$ is defined by a $D \times D$ positive semidefinite (psd) matrix ($I(\Theta) \succeq 0$) with coefficients

$$I_{ij} = E_p\!\left[ \frac{\partial l}{\partial \Theta_i} \, \frac{\partial l}{\partial \Theta_j} \right],$$

where $l(\Theta)$ denotes the log-density function $\log p(x \mid \Theta)$. Under light regularity conditions, the FIM can be rewritten equivalently as

$$I_{ij} = -E_p\!\left[ \frac{\partial^2 l}{\partial \Theta_i \, \partial \Theta_j} \right].$$
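To make the definition concrete, here is a minimal Python sketch (not from the paper) that estimates the FIM by Monte Carlo, averaging outer products of the score vector. The univariate Gaussian model and the function names are illustrative assumptions, chosen because the closed form $I = \mathrm{diag}(1/\sigma^2, 2/\sigma^2)$ gives an easy check.

```python
import numpy as np

def gaussian_score(x, mu, sigma):
    """Score vector d log p / d(mu, sigma) for a univariate Gaussian."""
    d_mu = (x - mu) / sigma**2
    d_sigma = (x - mu)**2 / sigma**3 - 1.0 / sigma
    return np.stack([d_mu, d_sigma], axis=-1)   # shape (n_samples, 2)

def empirical_fim(mu, sigma, n_samples=100_000, seed=0):
    """Monte Carlo estimate of I(Theta) = E_p[score score^T]."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=n_samples)
    s = gaussian_score(x, mu, sigma)
    return s.T @ s / n_samples                  # 2x2 psd matrix

print(empirical_fim(mu=0.5, sigma=2.0))
# closed form: diag(1/sigma^2, 2/sigma^2) = diag(0.25, 0.5)
```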

Similar Articles

Relative Natural Gradient for Learning Large Complex Models

Fisher information and the natural gradient have provided deep insights and powerful tools for artificial neural networks. However, the related analysis becomes more and more difficult as the learner's structure grows large and complex. This paper makes a preliminary step towards a new direction. We extract a local component of a large neuron system and define its relative Fisher information metric that des...

Adaptive natural gradient learning algorithms for various stochastic models

The natural gradient method has an ideal dynamic behavior which resolves the slow learning speed of the standard gradient descent method caused by plateaus. However, it requires calculating the Fisher information matrix and its inverse, which makes a direct implementation of the natural gradient almost impossible. To solve this problem, a preliminary study has been proposed concerning an adaptiv...
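To illustrate what such adaptive schemes must approximate, here is a minimal, hypothetical sketch of a damped natural gradient step with a running empirical Fisher estimate; it is not the algorithm of the paper above, and all names are illustrative.

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher_est, lr=0.01, damping=1e-4):
    """One natural gradient update: theta <- theta - lr * F^{-1} grad.

    `fisher_est` is a running estimate of the Fisher matrix; damping
    keeps the solve well-conditioned when the estimate is near-singular."""
    F = fisher_est + damping * np.eye(len(theta))
    return theta - lr * np.linalg.solve(F, grad)

def update_fisher(fisher_est, score, decay=0.99):
    """Adaptive (exponential moving average) Fisher estimate built from
    per-sample score vectors, in the spirit of adaptive natural gradient."""
    return decay * fisher_est + (1 - decay) * np.outer(score, score)
```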

Natural Langevin Dynamics for Neural Networks

One way to avoid overfitting in machine learning is to use model parameters distributed according to a Bayesian posterior given the data, rather than the maximum likelihood estimator. Stochastic gradient Langevin dynamics (SGLD) is one algorithm to approximate such Bayesian posteriors for large models and datasets. SGLD is a standard stochastic gradient descent to which is added a controlled am...
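As a reference point for the description above, a minimal SGLD update is a stochastic gradient step on the negative log-posterior plus Gaussian noise of variance 2·lr; this generic sketch omits the natural/preconditioned variant that the paper studies.

```python
import numpy as np

def sgld_step(theta, stochastic_grad, lr, rng):
    """Stochastic gradient Langevin dynamics: an SGD step on the negative
    log-posterior plus injected Gaussian noise of variance 2*lr, so the
    iterates approximately sample the Bayesian posterior."""
    noise = rng.normal(0.0, np.sqrt(2.0 * lr), size=theta.shape)
    return theta - lr * stochastic_grad + noise

# usage: rng = np.random.default_rng(0); theta = sgld_step(theta, g, 1e-4, rng)
```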

True Asymptotic Natural Gradient Optimization

We introduce a simple algorithm, True Asymptotic Natural Gradient Optimization (TANGO), that converges to a true natural gradient descent in the limit of small learning rates, without explicit Fisher matrix estimation. For quadratic models the algorithm is also an instance of averaged stochastic gradient, where the parameter is a moving average of a “fast”, constant-rate gradient descent. TANGO...
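The averaged-stochastic-gradient view can be sketched as follows; this is only the generic Polyak-averaging idea mentioned in the snippet (a "fast" constant-rate iterate whose running average is reported), not the TANGO algorithm itself.

```python
def averaged_sgd_step(fast_theta, avg_theta, grad, lr, t):
    """One step of averaged SGD: the 'fast' iterate moves at a constant
    rate; the reported parameter is its running (Polyak) average.
    Works elementwise on scalars or numpy arrays; t counts from 0."""
    fast_theta = fast_theta - lr * grad
    avg_theta = avg_theta + (fast_theta - avg_theta) / (t + 1)
    return fast_theta, avg_theta
```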

متن کامل

On "Natural" Learning and Pruning in Multilayered Perceptrons

Several studies have shown that natural gradient descent for on-line learning is much more efficient than standard gradient descent. In this paper, we derive natural gradients in a slightly different manner and discuss implications for batch-mode learning and pruning, linking them to existing algorithms such as Levenberg-Marquardt optimization and optimal brain surgeon. The Fisher matrix plays an ...
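For context, the optimal brain surgeon step referenced above prunes the weight with the smallest saliency and compensates via the inverse Hessian (for which the Fisher matrix is a common stand-in); a minimal sketch, assuming the inverse is already available:

```python
import numpy as np

def obs_prune_one(weights, hessian_inv):
    """Optimal brain surgeon: delete the weight with the smallest saliency
    w_q^2 / (2 [H^-1]_qq), then adjust all weights to compensate."""
    saliency = weights**2 / (2.0 * np.diag(hessian_inv))
    q = int(np.argmin(saliency))
    delta = -weights[q] * hessian_inv[:, q] / hessian_inv[q, q]
    return weights + delta, q   # entry q of the result is exactly zero
```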


Publication year: 2017